Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads


Abstract

Within the past few years, organizations in diverse industries have adopted MapReduce-based systems for large-scale data processing. Along with these new users, important new workloads have emerged which feature many small, short, and increasingly interactive jobs in addition to the large, long-running batch jobs for which MapReduce was originally designed. As interactive, large-scale query processing is a strength of the RDBMS community, it is important that lessons from that field be carried over and applied where possible in this new domain. However, these new workloads have not yet been described in the literature. We fill this gap with an empirical analysis of MapReduce traces from six separate business-critical deployments inside Facebook and at Cloudera customers in e-commerce, telecommunications, media, and retail. Our key contribution is a characterization of new MapReduce workloads which are driven in part by interactive analysis, and which make heavy use of query-like programming frameworks on top of MapReduce. These workloads display diverse behaviors which invalidate prior assumptions about MapReduce such as uniform data access, regular diurnal patterns, and prevalence of large jobs. A secondary contribution is a first step towards creating a TPC-like data processing benchmark for MapReduce.
